1.  INTRODUCTION

For both biological systems and machines, vision begins with a large and unwieldy array
of measurements of the amount of light reflected from surfaces in the environment. The
goal of vision is to recover physical properties of objects in the scene, such as the location
of object boundaries and the structure, color and texture of object surfaces, from the
two-dimensional image that is projected onto the eye or camera. This goal is not achieved
in a single step; vision proceeds in stages, with each stage producing increasingly more
useful descriptions of the image and then the scene. The first clues about the physical
properties of the scene are provided by the changes of intensity in the image. For example,
in Figure 1, the boundaries of the sculpture, the markings and bright highlights on its
surface, and the shadows that the trees cast on the snow all give rise to spatial changes
in light intensity. The geometrical structure, sharpness and contrast of these intensity
changes convey information about the physical edges in the scene. The importance of
intensity changes and edges in early visual processing has led to extensive research on
their detection, description and use, both in computer and biological vision systems.

Figure 1. A natural image, exhibiting intensity changes due to many physical factors.


The process of edge detection can be divided into two stages: first, intensity changes
in the image are detected and described; second, physical properties of edges in the
scene are inferred from this image description. Section 2 concentrates on the first stage,
about which more is known at this time. Section 3 briefly describes some areas of vision
research that address the second stage. This article mainly reviews some of the theory
that underlies the detection of edges, and the methods used to carry out this analysis.
There is also some reference to studies of early processing in biological vision systems.
This article does not present a complete review of the edge detection literature; rather
it introduces the reader to some of the basic issues that are considered central to the
problem of edge detection.

2.  THE  DETECTION  OF  INTENSITY  CHANGES 

The most commonly used methods for detecting intensity changes incorporate three
essential operations. First, the image intensities are either smoothed or approximated
locally by a smooth analytic function. Second, the smoothed intensities are differentiated,
using either a first or second derivative operation. Third, simple features in the result of
this differentiation stage, such as peaks (positive and negative extrema) or zero-crossings
(transitions between positive and negative values), are detected and described. This
section first describes briefly the role of these operations in the detection of intensity
changes and then presents, in more detail, some of the methods used to carry out these
operations.
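The three operations can be sketched for a one-dimensional intensity profile as follows.
This is a minimal illustration, not a method taken from any of the studies reviewed here:
the function names, the truncation of the Gaussian at 4σ, the use of finite differences
and the tolerance `tol` are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sampled Gaussian, truncated at 4 sigma and normalized to unit sum (assumed choices)."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    return kernel / kernel.sum()

def detect_intensity_changes(profile, sigma=2.0, tol=1e-9):
    """Return (positions, slopes) at zero-crossings of the smoothed second difference."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # 1. Smooth; "valid" keeps only samples where the kernel fits inside the profile.
    smoothed = np.convolve(profile, kernel, mode="valid")
    # 2. Differentiate with first and second differences.
    first = np.diff(smoothed)
    second = np.diff(smoothed, n=2)
    # 3. Detect sign changes, treating values below tol as exactly zero.
    second = np.where(np.abs(second) < tol, 0.0, second)
    signs = np.sign(second)
    idx = np.nonzero(signs[:-1] * signs[1:] < 0)[0]
    positions = idx + radius + 1.5   # map back to coordinates of the input profile
    slopes = first[idx + 1]          # first difference at each crossing
    return positions, slopes

# Ideal step lying between samples 49 and 50.
profile = np.concatenate([np.zeros(50), np.ones(50)])
positions, slopes = detect_intensity_changes(profile)
```

The sign of each returned slope gives the polarity of the change and its magnitude grows
with contrast, which is the kind of description the third operation is meant to provide.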

The smoothing operation serves two purposes. First, it reduces the effect of noise
on the detection of intensity changes. Second, it sets the resolution or scale at which
intensity changes are detected. The sampling and transduction of light by the eye or
camera introduces spurious changes of light intensity that do not correspond to significant
physical changes in the scene. Smoothing of the intensities can remove these minor
fluctuations due to noise. Figure 2a shows a one-dimensional intensity profile that is
shown smoothed by a small amount in Figure 2b. Small variations of intensity, due
in part to noise in the digitizing camera, do not appear in the smoothed intensities.
Approximation of the intensity function by a smooth analytic function can serve the
same purpose as a smoothing operation.

Significant changes in the image can also occur at multiple resolutions. Consider,
for example, a leopard’s coat. At a fine resolution, rapid fluctuations of intensity might
delineate the individual hairs of the coat, while at a coarser resolution, the intensity
changes might delineate only the leopard’s spots. Changes at different resolutions can
often be detected by smoothing the image intensities by different amounts. Figure 2c
illustrates a more extensive smoothing of the intensity profile of Figure 2a, which preserves
only the gross changes of intensity.

The differentiation operation accentuates intensity changes and transforms the image
into a representation from which properties of these changes can be extracted more
easily. A significant intensity change gives rise to a peak in the first derivative or a
zero-crossing in the second derivative of the smoothed intensities, as illustrated in Figures 2d
and 2e, respectively. These peaks or zero-crossings can be detected straightforwardly,
and properties such as the position, sharpness and height of the peaks capture the
location, sharpness and contrast of the intensity changes in the image. The detection and
description of these features in the smoothed and differentiated image provides a compact
representation that captures meaningful information in the image. Marr (1) called
this representation the Primal Sketch of the image. Later processes, such as binocular
stereo, motion measurement and texture analysis, whose goal is to recover the physical
properties of the scene, may then operate directly on this description of image features.

Figure 2. Detecting Intensity Changes. (a) One-dimensional intensity profile; the intensities
along a horizontal scan line in an image are represented as a graph. (b) The result of smoothing
the profile in (a). (c) The result of additional smoothing of (a). (d) and (e) The first and second
derivatives, respectively, of the smoothed profile shown in (c). The vertical dashed lines indicate
the peaks in the first derivative and zero-crossings in the second derivative that correspond to
two significant intensity changes.


2.1  THE  ONE-DIMENSIONAL  DETECTION  OF  INTENSITY
CHANGES


The theory that underlies the detection of intensity changes in two-dimensional images
is based heavily on the analysis of one-dimensional signals. This section discusses three
topics that have been addressed in this analysis: (1) the design of optimal operators for
performing smoothing and differentiation, (2) the information content of the description
of signal features such as zero-crossings, and (3) the relationship between features
that are detected at multiple resolutions. Studies of these issues have used a variety of
theoretical approaches that appear to yield similar conclusions.

Some of the early methods for detecting intensity changes incorporated only limited
smoothing of the intensities and performed the differentiation by taking first or second
differences between neighboring image elements (examples of this early work can be found
in (2-8)). In one dimension, this is equivalent to performing a convolution of the intensity
profile with operators of the type shown on the left in Figures 3b and 3c. Additional
smoothing can be performed by increasing the spatial extent of these operators.
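The equivalence between differencing and convolution can be checked directly. The sketch
below is illustrative only: the operator weights are assumed examples of step-like
operators, not values read from Figure 3, and the wider operator is one possible way of
increasing spatial extent.

```python
import numpy as np

profile = np.array([10, 10, 10, 10, 50, 50, 50, 50], dtype=float)

# Step-like operators (assumed weights).
first_difference = np.array([1.0, -1.0])          # I[k] - I[k-1]
second_difference = np.array([1.0, -2.0, 1.0])    # I[k] - 2 I[k-1] + I[k-2]
# A wider first-difference operator: mean of two samples minus mean of the previous two.
wide_first_difference = np.array([1.0, 1.0, -1.0, -1.0]) / 2.0

d1 = np.convolve(profile, first_difference, mode="valid")
d2 = np.convolve(profile, second_difference, mode="valid")
d1_wide = np.convolve(profile, wide_first_difference, mode="valid")
```

Here `d1` has a single peak at the step, `d2` changes sign there, and `d1_wide` spreads
the peak over more samples, which is the price of the extra smoothing.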

The operators in Figures 3b and 3c contain step-like changes. Other studies have
employed Gaussian smoothing of the image intensities (for example, 9-13). Combined
with the first and second derivative operations, Gaussian smoothing yields convolution
operators of the type shown in Figures 3d and 3e. Several arguments have been put forth
in support of the use of Gaussian smoothing. Marr and Hildreth (11, 12) argued that the
smoothing function should have both limited support in space and limited bandwidth in
frequency. In general terms, a limited support in space is important because the physical
edges to be detected are spatially localized. A limited bandwidth in frequency provides a
means of restricting the range of scales over which intensity changes are detected, which is
sometimes important in applications of edge detection. The Gaussian function minimizes
the product of bandwidths in space and frequency. The use of smoothing functions that
do not have limited bandwidths in space and frequency can sometimes lead to poorer
performance, reflected in a greater sensitivity to noise, the false detection of edges that
do not exist, or a poor ability to localize the position of edges (see, for example, 11, 14).
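The statement that the Gaussian minimizes the product of bandwidths can be made more
concrete. The form below is a sketch that assumes bandwidth is measured by root-mean-square
width, a choice the text does not specify: Δx is the RMS width of |f(x)|² in space and Δω
is the RMS width of |F(ω)|² in angular frequency, where F is the Fourier transform of the
smoothing function f.

$$\Delta x \, \Delta \omega \;\ge\; \frac{1}{2}$$

Under this measure, equality holds when f is a Gaussian. For f(x) = e^{-x²/2σ²} the two
widths are σ/√2 and 1/(σ√2), so narrowing the operator in space by some factor widens it
in frequency by the same factor.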

Shanmugam, Dickey and Green (15) derived an optimal frequency domain filter
for detecting intensity changes, using the criteria that the filter: (1) yields maximum
energy in the vicinity of an edge in the image, (2) has limited frequency bandwidth,
(3) yields a small output when the input is constant or slowly varying, and (4) is an
even function in space. For the special case of detecting step changes of intensity, the
optimal frequency domain filter corresponds to a spatial operator that is approximately
the second derivative of a Gaussian (for a given bandwidth) shown in Figure 3e.

In a later study, Canny (14) used the following criteria to derive an optimal operator:
(1) good detection ability, that is, there should be low probabilities of failing to detect
real edges and falsely detecting edges that do not exist, (2) good localization ability, that
is, the position of the detected edge should be as close as possible to the true position
of the edge, and (3) uniqueness of detection, that is, a given edge should be detected
only once. The first two criteria are related by an uncertainty principle; as detection
ability increases, localization ability decreases, and vice versa. The analysis also assumed
that extrema in the output of the operator indicate the presence of an edge. For the
particular case in which an "edge" is defined as a step change of intensity, the operator
that optimally satisfies these criteria is a linear combination of four exponentials, which
can be approximated closely by the first derivative of a Gaussian shown in Figure 3d.
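The Gaussian approximation can be tried on a noisy step. The sketch below is not Canny's
optimal operator or his full algorithm; it only convolves a profile with a sampled first
derivative of a Gaussian and takes the largest extremum as the edge position. The noise
level, the value of σ, the truncation at 4σ and the function names are assumptions made
for the example.

```python
import numpy as np

def gaussian_first_derivative(sigma):
    """Sampled G'(x) for a Gaussian of standard deviation sigma, truncated at 4 sigma."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -x / sigma**2 * g

def locate_step(profile, sigma):
    """Index of the strongest extremum of the profile convolved with G'(x)."""
    kernel = gaussian_first_derivative(sigma)
    radius = len(kernel) // 2
    # "valid" avoids artificial edges created by padding at the ends of the profile.
    response = np.convolve(profile, kernel, mode="valid")
    return radius + int(np.argmax(np.abs(response)))

rng = np.random.default_rng(0)
true_edge = 120                                   # first sample of the bright side
clean = np.where(np.arange(200) >= true_edge, 1.0, 0.0)
noisy = clean + rng.normal(scale=0.05, size=clean.size)
detected = locate_step(noisy, sigma=3.0)
```

Raising `sigma` suppresses more noise but flattens the peak of the response, so the
position of the extremum becomes less certain; this is the trade-off between the first
two criteria.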

Poggio, Voorhees and Yuille (16) and Torre and Poggio (17) derived an optimal
smoothing operator, using the tools of regularization theory from mathematical physics.
They began with the observation that numerical differentiation of the image is a
mathematically ill-posed problem (18), because its solution does not depend continuously on
the input intensities (this is equivalent to saying that the solution is not robust against
noise). The smoothing operation serves to regularize the image, making the differentiation
operation mathematically well-posed. In the case where the image intensities are
assumed to contain noise, the following method was used to regularize the image. First,
let I(x) denote the continuous intensity function, which is sampled at a set of discrete
locations x_k, 1 ≤ k ≤ n, and let S(x) denote the smoothed intensity function to be
computed. It was assumed that S(x) should both fit the sampled intensities as closely
as possible and be as smooth as possible. Using the tools of regularization theory, this
was formulated as the computation of the function S(x) that minimizes the following
expression:

$$\sum_{k=1}^{n} \bigl( I(x_k) - S(x_k) \bigr)^2 + \lambda \int \| S''(x) \|^2 \, dx .$$

The first term measures how well S(x) fits the sampled intensities and the second term
measures the smoothness of S(x). The constant λ controls the trade-off between these
two measures. Poggio, Voorhees and Yuille showed that the solution to this minimization
problem is equivalent to the convolution of the image intensities with a cubic spline
that is very similar to the Gaussian. Torre and Poggio (17) further expanded upon
the theoretical properties of a broad range of smoothing filters, from the perspective of
regularizing the image intensities for differentiation.
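A discrete analogue of this minimization is easy to write down. The sketch below is a
simplification and not the derivation used in (16): it assumes unit sample spacing,
replaces the integral of the squared second derivative by a sum of squared second
differences, and solves the resulting linear system directly. The names
`regularized_smooth`, `roughness` and `lam` are hypothetical.

```python
import numpy as np

def regularized_smooth(samples, lam):
    """Minimize sum (samples - s)^2 + lam * sum (second difference of s)^2 over s."""
    n = len(samples)
    # D maps s to its second differences: (D s)[i] = s[i] - 2 s[i+1] + s[i+2].
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Setting the gradient of the quadratic cost to zero gives (I + lam D^T D) s = samples.
    A = np.eye(n) + lam * D.T @ D
    return np.linalg.solve(A, samples)

def roughness(values):
    """Sum of squared second differences, the discrete smoothness term."""
    return float(np.sum(np.diff(values, n=2) ** 2))

ramp = np.linspace(0.0, 1.0, 20)
rng = np.random.default_rng(1)
noisy = ramp + rng.normal(scale=0.05, size=ramp.size)
smooth = regularized_smooth(noisy, lam=10.0)
```

With `lam = 0` the samples are returned unchanged, and a straight ramp is left untouched
for any `lam` because its second differences vanish; larger values of `lam` trade fidelity
to the samples for smoothness.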

Another approach to the smoothing stage is to find an analytic function that best
models or approximates the local intensity pattern. An early representative of this
approach was the Hueckel operator (5, 7). Surface-fitting methods used a variety of basis
functions to perform the approximation, including planar functions (19) and quadratic
functions (20). More recently, Haralick (21, 22) used the discrete Chebychev polynomials
to approximate the image intensities. In these methods, a differentiation operation is
then performed analytically on the polynomial approximation of the intensity function.
The method of approximation used by Haralick (21, 22) is roughly equivalent to smoothing
the image by convolution with spatial operators such as those derived by Canny (14)
and Poggio, Voorhees and Yuille (16). A rigorous comparison between the performance
of surface-fitting versus direct smoothing methods has not yet been made.
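The idea of fitting first and differentiating analytically can be illustrated in one
dimension. This sketch is not Haralick's method: it uses an ordinary least-squares
polynomial fit over a small window rather than discrete Chebychev polynomials over a
two-dimensional neighborhood, and the window size, degree and function name are
assumptions.

```python
import numpy as np

def local_poly_derivatives(profile, center, half_width=3, degree=3):
    """Fit a polynomial around `center` and return its first and second derivative there."""
    offsets = np.arange(-half_width, half_width + 1, dtype=float)
    window = profile[center - half_width:center + half_width + 1]
    # Coefficients are ordered from low to high degree: c0 + c1 t + c2 t^2 + ...
    coeffs = np.polynomial.polynomial.polyfit(offsets, window, degree)
    first = coeffs[1]            # derivative of the fitted polynomial at t = 0
    second = 2.0 * coeffs[2]     # second derivative at t = 0
    return first, second

x = np.arange(21, dtype=float)
profile = 2.0 + 3.0 * x + 0.5 * x**2
first, second = local_poly_derivatives(profile, center=10)
```

Because the fit averages over the window, it smooths the data implicitly, which is why
such methods can behave much like direct smoothing followed by differentiation.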

A second issue that bears on the choice of operator for the smoothing and differentiation
stages is the information content of the subsequent description of image features.
That is, to what extent does a representation of only the significant changes of intensity
capture all of the important information in an image? This question led to a number
of theoretical studies of the reconstruction of a signal from features such as its
zero-crossings. Although the goal of vision is not to reconstruct the visual image, these results
are important because they suggest that an image can be transformed into a compact
representation of its features with little loss of information.

An early study by Logan (23) that interested many vision researchers addressed the
information content of the zero-crossings of a signal. Logan proved that if a signal has
(1) a frequency bandwidth of less than one octave and (2) no zeros in common with its
Hilbert transform, then the signal can be entirely reconstructed from the positions of its
zero-crossings, up to a multiplicative constant. The second condition is almost always
satisfied for physical signals. This result has also been extended to two dimensions (1).
This analysis is interesting, because it shows that the zero-crossings of a signal are very
rich in information. Its direct relevance to vision is limited, however, because the initial
smoothing and differentiation of an image is typically performed by operators that are
not one-octave bandpass in frequency.

Other studies have addressed the information content of features of signals that are
more relevant to visual processing. For example, Yuille and Poggio (24) proved some
interesting results regarding the zero-crossings (or more generally, the level-crossings*)
of an image that is convolved with the second derivative of a Gaussian, over a continuous
range of scales. Before stating the results, we introduce the scale-space representation of
zero-crossings used by Witkin (25), illustrated in Figure 4. First, let the one-dimensional
Gaussian function be defined as follows, where σ is the standard deviation of the
Gaussian:

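$$G(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-x^2 / 2\sigma^2}$$

The second derivative of the Gaussian function is then given by the following expression:

$$G''(x) = \frac{-1}{\sigma^3 \sqrt{2\pi}} \left( 1 - \frac{x^2}{\sigma^2} \right) e^{-x^2 / 2\sigma^2}$$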

Suppose that a one-dimensional signal I(x) is convolved with G''(x) for a continuous
range of standard deviations σ and the positions of the zero-crossings are marked for
each size or scale. Figure 4 shows an intensity profile (Figure 4a) that is convolved
with a G''(x) function with large σ (Figure 4b). The positions of the zero-crossings are
marked with heavy dots. In the scale-space representation of Figure 4c, the vertical
dimension represents the value of σ and the horizontal dimension represents position in
the signal. For each value of σ, the positions of the zero-crossings of I(x) * G''(x) are
plotted as points along a horizontal line in this diagram. For example, points along the
dashed line at σ = σ_1 indicate the positions of the zero-crossings of the signal in Figure
4b. The scale-space representation of zero-crossings illustrates the behavior of these
features across scales. For small σ, the zero-crossings capture all of the changes in the
original intensity function. At coarser scales (larger σ), the positions of the zero-crossings
capture only the gross changes of intensity.
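A scale-space map of this kind can be computed with a few lines of code. The sketch below
is an illustration under stated assumptions rather than Witkin's implementation: G''(x)
is sampled and truncated at 4σ, only a discrete set of scales is used, the test signal is
synthetic, and values smaller than `tol` are treated as zero when looking for sign changes.

```python
import numpy as np

def gaussian_second_derivative(sigma):
    """Sampled G''(x) for a Gaussian of standard deviation sigma, truncated at 4 sigma."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -(1.0 - x**2 / sigma**2) / sigma**2 * g

def zero_crossings(values, tol=1e-9):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    values = np.where(np.abs(values) < tol, 0.0, values)
    signs = np.sign(values)
    return np.nonzero(signs[:-1] * signs[1:] < 0)[0]

def scale_space(profile, sigmas):
    """Map each sigma to the zero-crossing positions of profile * G'' at that scale."""
    rows = {}
    for sigma in sigmas:
        kernel = gaussian_second_derivative(sigma)
        filtered = np.convolve(profile, kernel, mode="valid")
        rows[sigma] = zero_crossings(filtered) + len(kernel) // 2
    return rows

# Synthetic profile: a slow variation plus a fine ripple.
x = np.arange(400)
profile = np.sin(2 * np.pi * x / 100 + 0.5) + 0.5 * np.sin(2 * np.pi * x / 8 + 0.2)
rows = scale_space(profile, sigmas=[1.0, 10.0])
```

Plotting each entry of `rows` as points on a horizontal line at height σ gives a coarse
version of the diagram described above. For this profile the small scale follows the
ripple, while the large scale retains only the crossings of the slow variation.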

The scale-space representation is visually suggestive of a fingerprint. In fact, in
much the same way that a fingerprint uniquely identifies a person, the scale-space
representation may uniquely identify an image. Yuille and Poggio (24) proved that for
almost all one-dimensional signals, the scale-space map of the zero-crossings of the
signal convolved with G''(x) over a continuum of scales determines the signal uniquely, up
to a multiplicative constant and an additional harmonic function. The proof provides
a method for reconstructing a signal I(x) from knowledge of how the zero-crossings
of I(x) * G''(x) change across scales. The use of Gaussian smoothing is critical to the
completeness of the subsequent feature representation, but the basic theorem applies to
zero-crossings and level-crossings of the result of applying any linear differential
operator to the Gaussian-filtered signal. Yuille and Poggio also derived a two-dimensional
extension to this result.

*The level-crossings of a signal are the points at which a value v is crossed by the signal, where
v may be non-zero.

Careful observation of the contours in the scale-space representation of Figure 4c
reveals that the contours either begin at the smallest scale and continue as a single,
isolated contour through larger scales as shown in Figure 4d(1), or they form closed,
inverted bowl-like shapes as shown in Figure 4d(2). Additional zero-crossings are never
created as scale increases; that is, there are no contours in the scale-space representation
of the type shown in Figures 4d(3) and 4d(4). This observation has been supported
by a number of theoretical studies (26-28), which have also shown that the Gaussian
function is the only smoothing function that yields this behavior of subsequent features
across scale. This observation applies to zero-crossings and level-crossings of the result of
applying any linear differential operator to the Gaussian-smoothed signal. This behavior
of features across scale has been exploited successfully in the qualitative analysis of
one-dimensional signals (25).

To summarize, the analysis of one-dimensional signals has been important for
developing a solid theoretical foundation on which to base methods for detecting intensity
changes in an image. Several theoretical studies attempted to derive an optimal operator
for detecting intensity changes, using a variety of criteria for evaluating the performance
of the operator. All of these operators essentially perform a smoothing and differentiation
of the image intensities. Furthermore, the one-dimensional analyses all point to
operators whose spatial shape is roughly the first or second derivative of a Gaussian function.
Mathematical studies also addressed the information content of representations of image
features and the behavior of these features across multiple scales. These latter studies
also stressed the importance of Gaussian smoothing.* Interestingly, the initial filters in
the human visual system also appear to perform a spatial convolution of the image with
a function that is closely approximated by the second derivative of a Gaussian (29). It is
also well known that the human visual system initially analyzes the retinal image through
a number of spatial filters that differ in the amount of smoothing that is performed in
space and in time (29).

2.2  THE  TWO-DIMENSIONAL  DETECTION  OF  INTENSITY 
CHANGES 

The problems that were addressed in the one-dimensional analysis of intensity signals
also arise for the detection of intensity changes in two-dimensional images, although
their solution is more complex. The design of optimal operators for performing the
smoothing and differentiation stages, for example, is complicated by a larger selection
of possible derivative operations that can be performed in two dimensions. Many of the
mathematical results regarding the information content of image features and behavior
of features across scale have been extended to two dimensions, but the algorithms for
extracting and describing these features in the image are also more complex than their
one-dimensional counterparts. This section reviews some of the techniques used to detect
and describe intensity changes in two-dimensional images.

*It should be noted again that some edge detection methods that perform an analytic
approximation of the intensity function may be equivalent to those performing a direct smoothing
operation with a Gaussian function.

Early work on edge detection primarily used directional first and second derivative
operators for performing the two-dimensional differentiation (2-10, 19, 20, 30-32). A
change of intensity that is extended along some orientation in the image gives rise to a
peak in the first derivative of intensity taken in the direction perpendicular to the
orientation of the intensity change, or a zero-crossing in the second directional derivative. The
simplest directional operators are formed by extending one-dimensional cross-sections
such as those shown in Figure 3 along some two-dimensional direction in the image.
Directional operators have differed in the shape of their cross-sections, both perpendicular
to and along their primary orientations. Macleod (9) and Marr and Poggio (10), for
example, used directional derivatives that embodied Gaussian smoothing.

In principle, the computation of the derivatives in two directions, such as the
horizontal and vertical directions, is sufficient to detect intensity changes at all orientations
in the image. Several algorithms, however, use directional operators at a large number of
discrete orientations (for example, see (4, 7, 8, 14, 32)). A given intensity change is
detected by a number of directional operators in this case and the output of the directional
operator that yields the largest response is typically used to describe the local intensity
change. Two examples of algorithms of this type are those of Nevatia and Babu (32) and
Canny (14). An example of the results of Canny’s algorithm is shown in Figure 5. The
contours of Figure 5b represent only the positions of the significant intensity changes in
Figure 5a.
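Both points can be illustrated together: the first derivative in any direction is a
combination of the horizontal and vertical derivatives, and a bank of discrete orientations
can be searched for the largest response. The sketch below is a simplification and not the
algorithm of Nevatia and Babu or of Canny, which use dedicated oriented masks and further
processing; the number of orientations and the function names are assumptions, and no
smoothing is applied.

```python
import numpy as np

def directional_responses(image, angles):
    """First derivative of the image in each direction given by `angles` (radians)."""
    # np.gradient returns derivatives along rows (vertical) and columns (horizontal).
    d_row, d_col = np.gradient(image.astype(float))
    return np.stack([np.cos(a) * d_col + np.sin(a) * d_row for a in angles])

def strongest_orientation(image, n_orientations=8):
    """Per pixel: the direction with the largest absolute response, and that response."""
    angles = np.arange(n_orientations) * np.pi / n_orientations   # directions in [0, pi)
    responses = directional_responses(image, angles)
    best = np.argmax(np.abs(responses), axis=0)
    strongest = np.take_along_axis(responses, best[None], axis=0)[0]
    return angles[best], strongest

# Vertical edge: dark on the left, bright on the right.
image = np.zeros((20, 20))
image[:, 10:] = 1.0
angle_map, response_map = strongest_orientation(image)
angle_map_t, response_map_t = strongest_orientation(image.T)   # horizontal edge
```

For the vertical edge the strongest response comes from the horizontal direction (angle 0),
which is perpendicular to the orientation of the edge, and for the transposed image it
comes from the vertical direction.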

Other related differential operators that are used in two dimensions are the first and
second derivatives in the direction of the gradient of intensity (14, 17, 22). The intensity
gradient, defined as follows:

$$\nabla I = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right)$$

is a vector that indicates the direction and magnitude of steepest increase in the
two-dimensional intensity function. Let n denote the unit vector in the direction of the
gradient. The differential operators ∂/∂n and ∂²/∂n² are non-directional operators, in the
sense that their value does not change when the image is rotated. They are also nonlinear
operators, and unlike the linear differential operators, cannot be combined with the
smoothing function in a single filtering step. Methods such as those of Nevatia and
Babu (32) and Canny (14) essentially use the directional derivative along the gradient
for extracting features.
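For reference, the derivatives along the gradient can be written in terms of ordinary
partial derivatives. These are standard identities, given here as a sketch rather than as
formulas quoted from the studies cited above; subscripts denote partial derivatives of I,
and the expressions hold wherever the gradient is non-zero.

$$\frac{\partial I}{\partial n} = |\nabla I| = \sqrt{I_x^2 + I_y^2}, \qquad
\frac{\partial^2 I}{\partial n^2} = \frac{I_x^2 I_{xx} + 2 I_x I_y I_{xy} + I_y^2 I_{yy}}{I_x^2 + I_y^2}$$

The square root, products and quotient of derivatives make the nonlinearity explicit, which
is why these operators cannot be folded into a single convolution with the smoothing
function.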

Figure 5. Canny’s Edge Detection Algorithm. (a) A natural image. (b) The positions of the
intensity changes detected by Canny’s algorithm. (Courtesy of J. F. Canny.)

A second non-directional operator that is used for detecting intensity changes is the
Laplacian operator, ∇² (1, 5, 11-13, 15, 33):

$$\nabla^2 I = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}$$

Combined with a two-dimensional Gaussian smoothing function,

$$G(r) = \frac{1}{2\pi\sigma^2} \, e^{-r^2 / 2\sigma^2},$$

the Laplacian yields the function given by the following expression:

$$\nabla^2 G(r) = \frac{-1}{\pi\sigma^4} \left( 1 - \frac{r^2}{2\sigma^2} \right) e^{-r^2 / 2\sigma^2}$$

r denotes the distance from the center of the operator and σ is the standard deviation of
the two-dimensional Gaussian. The ∇²G function is shaped something like a Mexican hat
in two dimensions. Figure 6 shows an example of the convolution of an image (Figure 6a)
with a ∇²G operator (Figure 6b). The Laplacian is a non-directional second derivative
operation; the elements in the output of the Laplacian that correspond to the location
of intensity changes in the image are therefore the zero-crossings. The zero-crossing
contours derived from Figure 6b are shown in Figure 6c. In this case, the zero-crossing
contours were located by detecting the transitions between positive and negative values
in the filtered image, by scanning in the horizontal and vertical directions.* A single
convolution of the image with the non-directional ∇²G operator allows the detection of
intensity changes at all orientations, for a given scale. The two-dimensional orientation
of a local portion of the zero-crossing contour can be computed from the gradient of the
filtered image (12).
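The filtering and scanning steps can be sketched as follows. This is a minimal
illustration, not the implementation behind Figure 6: the kernel is a sampled ∇²G
truncated at 4σ with its mean removed so that constant regions give zero output, the
convolution keeps only the region where the kernel fits inside the image, values below
`tol` are treated as zero, and all function names are hypothetical. It relies on
`sliding_window_view`, which requires NumPy 1.20 or later.

```python
import numpy as np

def log_kernel(sigma):
    """Sampled Laplacian-of-Gaussian operator, truncated at 4 sigma, with zero mean."""
    radius = int(np.ceil(4 * sigma))
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1].astype(float)
    r2 = x**2 + y**2
    kernel = -(1.0 - r2 / (2 * sigma**2)) * np.exp(-r2 / (2 * sigma**2)) / (np.pi * sigma**4)
    return kernel - kernel.mean()

def convolve2d_valid(image, kernel):
    """Two-dimensional convolution restricted to positions where the kernel fits."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])

def zero_crossing_map(filtered, tol=1e-9):
    """Mark sign changes between horizontal and vertical neighbors."""
    f = np.where(np.abs(filtered) < tol, 0.0, filtered)
    zc = np.zeros(f.shape, dtype=bool)
    zc[:, :-1] |= f[:, :-1] * f[:, 1:] < 0    # horizontal scan
    zc[:-1, :] |= f[:-1, :] * f[1:, :] < 0    # vertical scan
    return zc

# Vertical step edge between columns 19 and 20.
image = np.zeros((40, 40))
image[:, 20:] = 1.0
filtered = convolve2d_valid(image, log_kernel(2.0))
zc = zero_crossing_map(filtered)
```

For this image the marked elements form a single vertical line; in the coordinates of the
filtered array it sits at column 11, which corresponds to column 19 of the input because
the kernel radius is 8.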

It is not yet clear whether directional or non-directional operators are most
appropriate for detecting intensity changes. Both have advantages and disadvantages. The
use of the Laplacian is simpler and requires less computation than the use of either
directional derivatives or derivatives in the direction of the gradient. The directional
operators, however, yield somewhat better localization of the position of intensity changes
(14, 22), particularly in areas where the orientation of an edge is changing rapidly in the
image (34, 35). Features such as the zero-crossing contours, when derived with
non-directional operators, generally form smooth, closed contours, while features obtained
with directional operators generally do not have such special geometric properties (17).
Marr and Hildreth (11) showed that if the intensity function along the orientation of
an intensity change varies at most linearly, then the zero-crossings of the Laplacian
exactly coincide with the zero-crossings of a directional operator taken in the direction
perpendicular to the orientation of the intensity change. Torre and Poggio (17)
characterized more formally the relationship between the zeros of the Laplacian and those of
the second derivative in the direction of the gradient, in terms of the geometry of the
two-dimensional intensity surface. With regard to the use of directional versus
non-directional derivative operators, it is interesting to note that physiological studies reveal
that the retina analyzes the visual image through a circularly symmetric filter whose
spatial shape is given by the difference of two Gaussian functions (see, for example, 36,
37), which is closely approximated by the ∇²G function.
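The closeness of the two shapes can be checked numerically. The sketch below is an
illustration under assumptions that are not taken from the text: the ratio of 1.6 between
the two standard deviations is a commonly quoted choice, the grid size and candidate range
are arbitrary, and similarity is measured by the normalized inner product of the sampled
functions.

```python
import numpy as np

def gaussian2d(r2, sigma):
    """Two-dimensional Gaussian evaluated at squared radius r2."""
    return np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)

def laplacian_of_gaussian(r2, sigma):
    """Laplacian of the two-dimensional Gaussian evaluated at squared radius r2."""
    return -(1.0 - r2 / (2 * sigma**2)) * np.exp(-r2 / (2 * sigma**2)) / (np.pi * sigma**4)

def cosine_similarity(a, b):
    """Normalized inner product; 1.0 means identical shape up to a positive scale."""
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

coords = np.arange(-30, 31, dtype=float)
x, y = np.meshgrid(coords, coords)
r2 = x**2 + y**2

sigma_center = 3.0
sigma_surround = 1.6 * sigma_center      # assumed ratio
# Surround minus center, so that the sign matches the negative center of the Laplacian.
dog = gaussian2d(r2, sigma_surround) - gaussian2d(r2, sigma_center)

candidates = np.linspace(3.0, 5.0, 41)
scores = [cosine_similarity(dog, laplacian_of_gaussian(r2, s)) for s in candidates]
best_sigma = float(candidates[int(np.argmax(scores))])
best_score = max(scores)
```

The best-matching σ falls between the two standard deviations of the difference of
Gaussians, and the similarity at that σ is close to 1.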

Mathematical results regarding the information content and behavior across scales
of image features have some bearing on the choice of differential operators. For
example, Yuille and Poggio (28) showed that in two dimensions, the combination of Gaussian
smoothing with any linear differential operator yields zero-crossings or level-crossings
that behave well with increasing scale, in that no features are created as the size of the
Gaussian is increased. In the case of the second derivative along the gradient, Yuille
and Poggio proved that there is no smoothing function that avoids the creation of
zero-crossings with increasing scale. The completeness of the scale-space representation of
zero-crossings or level-crossings in two dimensions also requires the use of linear
differential operators (24).

*The design of robust methods for detecting zero-crossings remains an open area of research
in edge detection.

The analysis of intensity changes across multiple scales is a difficult problem that has
not yet found a satisfactory solution. There is a clear need to detect intensity changes at
multiple resolutions (2). Important physical changes in the scene take place at different
scales. Spatial filters that allow the description of fine detail in the intensity function
generally miss coarser structures in the image, while those that allow the extraction of
coarser features generally smooth out important detail. At all resolutions, some of the
detected features may not correspond to real physical changes in the scene. For example,
at the finest resolutions, some of the detected intensity changes may be a consequence of
noise in the sensing process. At coarser resolutions, spurious image features might arise as
a consequence of smoothing together nearby intensity changes. Sorting out the relevant
changes at each resolution and combining them into a representation that can be used
effectively by later processes remain difficult, unsolved problems. We mention here some
of the research that has attempted to address these problems.

Marr and Hildreth (11) explored the combination of zero-crossing descriptions that
arise from convolving an image with $\nabla^2 G$ operators of different size. An example of
these descriptions is illustrated in Figure 7. The zero-crossings from the smaller
$\nabla^2 G$ operator primarily detect the bumpy texture on the surface of the leaf, whereas
the zero-crossing contours from the larger operator also outline some of the highlights on
the leaf surface that are due to changing illumination (the arrows point to one example).
Marr and Hildreth suggested the use of spatial coincidence of zero-crossings across scale as
a means of indicating the presence of a real edge in the scene. Strong edges such as object
boundaries often give rise to sharp intensity changes in the image that are detected across
a range of scales and in roughly the same location in the image. In the one-dimensional
scale-space representation*, these edges give rise to roughly vertical lines. The existence
of contours in the scale-space representation that are roughly vertical and extend across
a range of scales could be used to infer the presence of a significant physical change at
the corresponding location in the scene.
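The spatial-coincidence idea can be illustrated on a one-dimensional signal. The sketch below is an assumption-laden toy, not Marr and Hildreth's implementation: it uses a sampled, truncated Gaussian followed by a discrete second difference in place of a $\nabla^2 G$ operator, and the function names, the two $\sigma$ values, and the one-sample tolerance are hypothetical choices.

```python
import math

def gaussian_kernel(sigma):
    """Sampled Gaussian truncated at 4 sigma and normalized to unit sum."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-(m * m) / (2.0 * sigma * sigma)) for m in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve with the Gaussian kernel, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[k]
        out.append(acc)
    return out

def zero_crossings(signal, sigma):
    """Sub-sample positions where the second difference of the smoothed signal changes sign."""
    s = smooth(signal, sigma)
    d2 = [s[i - 1] - 2 * s[i] + s[i + 1] for i in range(1, len(s) - 1)]
    crossings = []
    for i in range(len(d2) - 1):
        a, b = d2[i], d2[i + 1]
        if a * b < 0:
            # d2[i] belongs to sample i + 1; interpolate linearly between neighbors.
            crossings.append((i + 1) + a / (a - b))
    return crossings

def coincident_edges(signal, fine_sigma, coarse_sigma, tolerance):
    """Keep fine-scale zero-crossings that have a coarse-scale partner within tolerance."""
    fine = zero_crossings(signal, fine_sigma)
    coarse = zero_crossings(signal, coarse_sigma)
    return [p for p in fine if any(abs(p - q) <= tolerance for q in coarse)]

# A step edge between samples 59 and 60, plus a one-sample bump standing in for texture.
signal = [0.0] * 60 + [1.0] * 60
signal[20] = 0.3
edges = coincident_edges(signal, fine_sigma=1.0, coarse_sigma=3.0, tolerance=1.0)
```

In this toy signal the zero-crossings flanking the narrow bump move apart as $\sigma$ grows, so they fail the coincidence test, while the step produces a zero-crossing at the same place at both scales and is retained in `edges`.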

*The scale-space representation can be extended to two dimensions, in which the positions of
the zero-crossings on the x-y plane are represented across multiple operator sizes.

Figure 7. Multiple Operator Sizes. (a) A natural image. (b) and (c) The zero-crossings
that result from convolving the image with operators whose central positive region has a
diameter of 6 and 12 image elements, respectively. The arrows in (a) and (c) indicate a
highlight in the image that is detected by the larger operator.

Witkin (25) developed a method for constructing qualitative descriptions of
one-dimensional signals that uses the scale-space representation. The method embodied two
basic assumptions: (1) the identity assumption, that zero-crossings detected at different
scales, which lie on a common contour in the scale-space description, arise from a single
physical event; and (2) the localization assumption, that the true location of a physical
event that gives rise to a contour in the scale-space description is the contour's position
as $\sigma$ tends to zero. Coarser scales were used to identify important events in the
signal and finer scales were used to localize their position. Events that persisted over
large changes in scale also had special significance. Witkin's method, called scale-space
filtering, begins with the scale-space description and collapses it into a discrete tree
structure that represents the qualitative behavior of the signal. Some of the heuristics
embodied in this analysis may be useful for analyzing two-dimensional images.
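The identity and localization assumptions of scale-space filtering can be sketched as a coarse-to-fine tracking loop on a one-dimensional signal. This is only an illustration under stated assumptions, not Witkin's algorithm: it follows each coarse-scale zero-crossing to the nearest zero-crossing at the next finer scale, it does not build the tree structure, and the function names, the set of $\sigma$ values, and the test signal are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Sampled Gaussian truncated at 4 sigma and normalized to unit sum."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-(m * m) / (2.0 * sigma * sigma)) for m in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve with the Gaussian kernel, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[k]
        out.append(acc)
    return out

def zero_crossings(signal, sigma):
    """Sub-sample positions where the second difference of the smoothed signal changes sign."""
    s = smooth(signal, sigma)
    d2 = [s[i - 1] - 2 * s[i] + s[i + 1] for i in range(1, len(s) - 1)]
    crossings = []
    for i in range(len(d2) - 1):
        a, b = d2[i], d2[i + 1]
        if a * b < 0:
            crossings.append((i + 1) + a / (a - b))
    return crossings

def trace_contours(signal, sigmas_coarse_to_fine):
    """Follow each coarse zero-crossing to the nearest one at every finer scale.

    Each returned list is one contour: its first entry is the coarse-scale position
    (identification) and its last entry is the finest-scale position (localization).
    """
    contours = [[p] for p in zero_crossings(signal, sigmas_coarse_to_fine[0])]
    for sigma in sigmas_coarse_to_fine[1:]:
        candidates = zero_crossings(signal, sigma)
        if not candidates:
            break
        for contour in contours:
            contour.append(min(candidates, key=lambda q: abs(q - contour[-1])))
    return contours

# A large step between samples 59 and 60, followed by a small step five samples later.
signal = [0.0] * 60 + [1.0] * 5 + [1.2] * 55
contours = trace_contours(signal, [4.0, 3.0, 2.0, 1.0])
```

At the coarsest scale the two steps blur into a single zero-crossing that is pulled toward the smaller step. Following that contour to the finest scale moves its position back toward the location of the large step, which is the behavior the localization assumption relies on.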

Canny (14) used a different approach to combining descriptions of intensity changes
across multiple scales. Features were first detected at a set of discrete scales. The finest
scale description was then used to predict the results of the next larger scale, assuming
that the filter used to derive the larger scale description performs additional smoothing of
the image. In a particular area of the image, if there was a substantial difference between
the actual description at the larger scale and that predicted by the smaller scale, then it
was assumed that an important change is taking place at the larger scale that is not
detected at the finer scale. In this case, features detected at the larger scale were then
added to the final feature representation. Empirically, Canny found that most features
were detected at the finest scale and relatively few were added from coarser scales.

Poggio, Voorhees and Yuille (16) have also begun to explore the issue of detecting
intensity changes across scales, using the methods of regularization theory. Recall
that their approach was to find a smoothed intensity function $S(x)$, given the sampled
intensities $I(x_k)$, which minimizes the following expression:

$$\sum_k \bigl(I(x_k) - S(x_k)\bigr)^2 + \lambda \int \bigl|S''(x)\bigr|^2 \, dx$$

The parameter $\lambda$ controls the scale at which intensity changes are detected. That is,
if $\lambda$ is small, $S(x)$ closely approximates $I(x_k)$, and as $\lambda$ increases,
$S(x)$ becomes increasingly smooth. Regularization theory may suggest methods for choosing
the optimal $\lambda$ for a given set of data, which may be useful for analyzing changes
across multiple scales (18).
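The effect of $\lambda$ can be seen in a small discrete version of this minimization. The sketch below is not the authors' method: it assumes the smoothness term penalizes the squared second derivative, replaces that derivative with a second difference on unit-spaced samples, and solves the resulting normal equations $(\mathrm{Id} + \lambda D^{T} D)\,S = I$ by plain Gaussian elimination. All function names and the test samples are hypothetical.

```python
def second_difference_rows(n):
    """Rows of the (n - 2) x n matrix D that maps samples to their second differences."""
    rows = []
    for i in range(1, n - 1):
        row = [0.0] * n
        row[i - 1], row[i], row[i + 1] = 1.0, -2.0, 1.0
        rows.append(row)
    return rows

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def regularized_smooth(samples, lam):
    """Minimize sum (I_k - S_k)^2 + lam * sum (second difference of S)^2 over S."""
    n = len(samples)
    d = second_difference_rows(n)
    # Setting the gradient to zero gives (Id + lam * D^T D) S = I.
    a = [[(1.0 if i == j else 0.0) + lam * sum(row[i] * row[j] for row in d)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(a, samples)

def roughness(values):
    """Sum of squared second differences, the discrete smoothness term."""
    return sum((values[i - 1] - 2 * values[i] + values[i + 1]) ** 2
               for i in range(1, len(values) - 1))

# A step edge with an alternating +/-0.1 perturbation standing in for sensor noise.
samples = [(0.0 if k < 15 else 1.0) + 0.1 * (-1) ** k for k in range(30)]
smooth_small = regularized_smooth(samples, 0.1)   # stays close to the data
smooth_large = regularized_smooth(samples, 10.0)  # much smoother
```

Comparing `roughness(smooth_small)` with `roughness(smooth_large)` shows the smoothness term shrinking as $\lambda$ grows, while with $\lambda = 0$ the solution simply reproduces the samples.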

To summarize, there has been considerable progress on the detection and description
of intensity changes in two-dimensional images, but there still exist many open
questions. A large body of theoretical and empirical work has addressed the question of what
operators are most appropriate for performing the smoothing and differentiation stages.
Emerging from this work is a better understanding of the advantages and disadvantages
of various operators and the relationship between alternative approaches. It is unlikely
that a single method will be most appropriate for all tasks. The choice of operators
depends in part on the application, the nature of the later processes that use the
description of image features, and the available computational resources. Some interesting
work has begun to address the problem of detecting and integrating intensity changes across
multiple scales, but a satisfactory solution to this problem still eludes vision researchers.
A problem that was not discussed here is the computation of properties such as contrast
and sharpness of the intensity changes. There has been some work on this problem, but
it has not yet received a rigorous analytic treatment.


3. RECOVERING PROPERTIES OF THE PHYSICAL WORLD

In the introduction, it was noted that the goal of vision is to recover the physical
properties of objects in the scene, such as the location of object boundaries and the
structure, color and texture of object surfaces, from the two-dimensional image that is
projected onto the eye or camera. The detection of intensity changes in the image represents
only a first, meager step toward achieving this goal. This section briefly mentions some of
the areas of vision that address the recovery of physical properties of edges in the scene.

The property of edges that is perhaps most important and most studied is their
three-dimensional structure. The structure of edges is conveyed through many sources.
For example, the relative locations of corresponding edges in left and right stereo views
convey information about the location of the edges in three-dimensional space, and
relative movement between edges in the image can be used to assess their relative position
in space. Three-dimensional structure can also be inferred from the shape of the
two-dimensional projection of edge contours, the way in which edges intersect in the image,
and variations in surface texture. These latter cues are essential in the interpretation of
structure from a single, static photograph. Many algorithms that analyze these sources
are feature-based, in that the initial inferences regarding three-dimensional structure
are made at the locations of features such as significant intensity changes in the image.
Discussion of some of these processes for recovering three-dimensional structure can be
found, for example, in (1, 5, 7, 10, 13, 27, 30, 31, 38-40).

Another important property of edges is the type of physical change from which they
arise. For example, edges might be the consequence of object boundaries, changes in
surface orientation, shadows, highlights or light sources, surface markings, changes in
surface reflectance or material composition, and so on. Ultimately, it is necessary to
determine the physical source of each edge in the scene. While some interesting work has
been done in these areas, there remain many open problems (examples can be found in
(1, 5, 7, 13, 30, 31, 38-41)). The recovery of these physical properties of edges is likely
to be a main focus of future research on edge detection.